ABSTRACT 

Ordinary least-squares regression treats the 
variables asymmetrically, designating a dependent variable and one or 
more independent variables. When it is not obvious how to make this 
distinction, a researcher may prefer to use orthogonal regression, 
which treats the variables symmetrically. However, the usual 
procedure for orthogonal regression is not equivariant. A simple 
modification is proposed to overcome this serious defect. 
Illustrative computations involving 15 observations on 5 variables 
are provided, and a robust version of the method is discussed. The 
modified orthogonal regression allows a researcher to explore a 
symmetric, equivariant, and robust linear relationship among a set 
of variables. (Contains 6 references.) (Author/SLD) 

Abstract. Ordinary least-squares regression treats the 
variables asymmetrically, designating a dependent variable 
and one or more independent variables. When it is not 
obvious how to make this distinction, a researcher may 
prefer to use orthogonal regression, which treats the 
variables symmetrically. However, the usual procedure for 
orthogonal regression is not equivariant. We propose a 
simple modification to overcome this serious defect. 
Illustrative computations are provided, and a robust 
version of our method is discussed. 

Key words: least squares regression, orthogonal regression, 
equivariance, robust estimation. 

Orthogonal Regression and Equivariance 

1. Introduction 

To use ordinary least squares, one designates a dependent 
variable and one or more independent variables. This decision 
implies that the random error affects only the dependent 
variable. The choice of the dependent variable will usually be 
crucial for parameter estimates and the outcome of hypothesis 
tests. Sometimes considerations of cause and effect make it clear 
which variable is dependent and which are independent. Often, 
however, a researcher has no such preconception and prefers to 
treat the variables symmetrically. 

In that case, each variable is equally subject to the random 
error. An appropriate linear model is orthogonal regression, 
where the error is not measured along one axis. Instead it is 
measured perpendicular to the regression plane itself, the usual 
Euclidean notion of the distance from a point to a line [Morrison 
(1990), chapter 8]. 

Despite its appealing symmetry, this method has a major 
disadvantage: the coefficients in an orthogonal regression are 
not equivariant; they change in a complicated way when a variable 
is rescaled. A choice of units can make a single variable 
dominate the regression. Moreover, "standardization" begs the 
question of equivariance since it is just one of many ways to 
transform the variables into dimensionless numbers. Each such 
transformation produces a different orthogonal regression, and 
the relationships among the various regressions are not 
straightforward [Malinvaud (1966), chapter 1]. 

This lack of equivariance is evidently unsatisfactory. To 
some extent, it explains the popularity of ordinary least 
squares, where the regression coefficients adjust in an obvious 
and harmless way when any variable is rescaled [Morrison (1990), 
chapter 3]. 

We now propose a simple modification which makes orthogonal 
regression equivariant. This result is discussed in section 2, 
where a robust version is also described. Illustrative 
computations are provided in section 3. 
2. A least-squares solution 

Suppose that a data matrix X contains n joint observations on 
K variables (n > K). For convenience, all the variables are 
measured as deviations from their sample means. In the matrix 
equation 

Xb = u , (1) 

b is a column vector of K regression coefficients and u is a 
column vector of n residuals. Orthogonal regression selects 
b to minimize the residual sum of squares 

b'X'Xb = u'u . (2) 

A normalization is imposed to avoid the trivial solution b = 
0. Conventionally, b is constrained to lie on the unit sphere: 

b'b = 1 . (3) 

It then follows that b is the eigenvector corresponding to 
the smallest eigenvalue of X'X. However, we have emphasized that 
this solution lacks equivariance. Let us instead adopt the 
normalization 

b'e = 1 , (4) 
where e is a column vector of K units. Accordingly, the sum of 
the regression coefficients is one. The Lagrangian expression 

b'X'Xb - 2L(b'e - 1) (5) 

has a unique minimum at 

L = 1/[e'(X'X)^{-1}e] (6) 

and b = L(X'X)^{-1}e . (7) 

Equations (6) and (7) are the modified orthogonal regression 
which we propose; Raj [(1968), 16-17] has called this solution 
the "best weight function." Any computer software that handles 
matrices can easily calculate L and b. In fact, many statistical 
programs compute and display (X'X)^{-1}. We remark that the Lagrange 
multiplier L equals the minimum sum of squared residuals. 
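
The step from the Lagrangian (5) to equations (6) and (7), and the remark that L equals the minimum sum of squared residuals, can be spelled out as follows. This is only a sketch; it assumes that X'X is nonsingular, which the text leaves implicit.

```latex
\begin{aligned}
\frac{\partial}{\partial b}\left[\,b'X'Xb - 2L(b'e - 1)\right] = 2X'Xb - 2Le = 0
  &\;\Rightarrow\; b = L\,(X'X)^{-1}e, \\
b'e = 1 \;\Rightarrow\; L\,e'(X'X)^{-1}e = 1
  &\;\Rightarrow\; L = \frac{1}{e'(X'X)^{-1}e}, \\
b'X'Xb = L^{2}\,e'(X'X)^{-1}(X'X)(X'X)^{-1}e
  &= L^{2}\,e'(X'X)^{-1}e = L .
\end{aligned}
```

Since X'X is then positive definite, the objective (2) is strictly convex on the constraint set (4), so this stationary point is the unique minimum.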

The coefficient vector b is equivariant in the following 
sense. Suppose that each observation on the first X variable is 
multiplied by a positive constant c. This rescaling means that 
the first row of X'X is multiplied by c; then the first column of 
X'X is multiplied by c. Consequently, the first row of (X'X)^{-1} is 
multiplied by 1/c; then the first column of (X'X)^{-1} is multiplied 
by 1/c. 

If we now replace the first element of e by c, the 
normalization (4) becomes 

c b_1 + b_2 + ... + b_K = 1 . (8) 

Then the rescaling has no effect on L in equation (6). In 
equation (7), b_1 is divided by c; but no other coefficient is 
altered. In summary, the rescaling affects our modified 
orthogonal regression just as it affects ordinary least squares. 
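
The same argument can be written compactly in matrix form. The diagonal matrix C and the tilde notation for rescaled quantities are introduced here only for illustration and do not appear in the original.

```latex
\begin{aligned}
&C = \operatorname{diag}(c, 1, \dots, 1), \qquad \tilde X = XC, \qquad \tilde e = Ce, \\
&(\tilde X'\tilde X)^{-1} = (C\,X'X\,C)^{-1} = C^{-1}(X'X)^{-1}C^{-1}, \\
&\tilde L = \frac{1}{\tilde e'(\tilde X'\tilde X)^{-1}\tilde e}
          = \frac{1}{e'C\,C^{-1}(X'X)^{-1}C^{-1}\,Ce}
          = \frac{1}{e'(X'X)^{-1}e} = L, \\
&\tilde b = \tilde L\,(\tilde X'\tilde X)^{-1}\tilde e
          = L\,C^{-1}(X'X)^{-1}C^{-1}\,Ce = C^{-1}b .
\end{aligned}
```

The last line says that the first coefficient is divided by c and the others are unchanged. The same algebra goes through if several variables are rescaled at once, with C = diag(c_1, ..., c_K).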

Of course, it would usually be pointless to rescale an X 
variable and then nullify the effect by renormalizing, as in 
equation (8) . Our intention is merely to show that the choice of 
units for an X variable is not a substantive decision, as indeed 
it should not be. We remark that Srinivasan (1976) uses a 
normalization like (8) in the context of ordinal regression. 

If the X variables are not measured as deviations from the 
sample means, the model may require an intercept. It is computed 
as usual by passing the plane through the point of sample means 
[Malinvaud (1966), chapter 1]. 

When the X matrix may be contaminated by "outliers," a robust 
version of equations (6) and (7) can be calculated by the linear 
program 

Maximize L subject to 

Σ x_ik D_i + L = 0 for k = 1, ... , K (9) 

and -1 <= D_i <= 1 for i = 1, ... , n . 

In (9), the summation over i runs from 1 to n. L is again the 
Lagrange multiplier for normalization (4). At the optimum, L 
equals the minimum sum of the absolute values of the residuals, 
Σ|u_i|. The residuals themselves are listed as "reduced costs." A 
variable D_i = +1 or -1 if the corresponding observation i lies 
above or below the regression plane; if the observation i lies 
right on the plane, then -1 <= D_i <= 1. 

There are K constraints like (9), and the linear program 
reports a "dual variable" for each of them. These dual variables 
are the regression coefficients. To accommodate an intercept, the 
linear program may include constraint K+1: Σ D_i = 0. The solution 
by linear programming is related to (6) and (7) as a median is 
related to a mean, and this accounts for the robustness in the 
presence of outliers [Wagner (1959), Dodge (1987)]. 
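
One way to see why the optimal L equals the minimum sum of absolute residuals, and why the dual variables of the K constraints are the regression coefficients, is to write program (9) next to the least-absolute-deviations problem of which it is the linear-programming dual. The sketch below omits the intercept. The split of each residual into nonnegative parts, u_i = p_i - q_i, is a standard device assumed here and not taken from the original.

```latex
\begin{aligned}
\text{Primal:}\quad & \min_{b,\,p,\,q}\ \sum_{i=1}^{n}(p_i + q_i)
  \quad\text{s.t.}\quad \sum_{k=1}^{K} x_{ik}b_k - p_i + q_i = 0 \;\;(i = 1,\dots,n),
  \quad \sum_{k=1}^{K} b_k = 1, \quad p_i,\, q_i \ge 0, \\
\text{Dual:}\quad & \max_{D,\,L}\ L
  \quad\text{s.t.}\quad \sum_{i=1}^{n} x_{ik}D_i + L = 0 \;\;(k = 1,\dots,K),
  \quad -1 \le D_i \le 1 \;\;(i = 1,\dots,n).
\end{aligned}
```

At an optimum p_i + q_i = |u_i|, so by strong duality the maximal L equals the minimal Σ|u_i|. The signs attached to reported dual variables and reduced costs depend on the conventions of the particular linear-programming software.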

3. Illustrative calculations 

To illustrate equations (6), (7) and (8), we use some 
hypothetical data involving fifteen observations on five 
variables (n = 15, K = 5). The matrix X is: 

For (X'X)^{-1} we have 

In equation (6), the Lagrange multiplier is the reciprocal of 
the sum of the elements of (X'X)^{-1}. For our example, L = 0.1329. 
In equation (7), b contains the five row sums of (X'X)^{-1}, each 
row sum having been multiplied by L: 

b = (3.7420, 1.4065, -.4219, .0777, -3.8043)' . (10) 

or 3.7420 X_1 + 1.4065 X_2 - .4219 X_3 + .0777 X_4 - 3.8043 X_5 = 0 . 

Of course, any variable may be expressed in terms of the others; 
for example: 

X_2 = -2.6605 X_1 + 0.3000 X_3 - 0.0552 X_4 + 2.7048 X_5 . 
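
The re-expression above is simple arithmetic on the coefficients in (10): dividing through by the coefficient of the chosen variable and changing signs. A minimal Python sketch follows; the function name `solve_for` is hypothetical and used only for illustration.

```python
# Coefficients of equation (10), as reported in the text.
b = [3.7420, 1.4065, -0.4219, 0.0777, -3.8043]


def solve_for(b, j):
    """Rewrite sum_k b[k] * X_k = 0 as X_j = sum_{k != j} a[k] * X_k.

    j is a zero-based index; the returned dict is keyed by the
    one-based variable number used in the text.
    """
    return {k + 1: -bk / b[j] for k, bk in enumerate(b) if k != j}


coeffs = solve_for(b, 1)  # j = 1 selects the second variable, X_2
for k, a in coeffs.items():
    print(f"X_{k}: {a:+.4f}")
```

Rounded to four decimals, the printed coefficients agree with those in the expression for X_2 above.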

To illustrate equivariance, we multiply each observation on 
the first variable by ten. The new X'X = 

In equations (6) and (7), we replace the unit vector e by 
(10, 1, 1, 1, 1)' and again obtain L = .1329. The regression 
coefficients are 

b = (.3742, 1.4065, -.4219, .0777, -3.8043)' . (11) 

A comparison of (10) and (11) shows that the first 
coefficient has been divided by the scale factor of ten, but the 
other coefficients are unchanged. These results may also be 
compared with the coefficients in the usual orthogonal regression 
obtained from the smallest eigenvalue of X'X. Before the first 
variable is rescaled by ten, the eigenvector containing the 
regression coefficients is 

(.6804, .2518, -.0868, .0149, -.6826) . (12) 
After the first variable is rescaled by ten, the eigenvector is 

(.0913, .3445, -.1056, .0170, -.9282) . (13) 

The two eigenvectors, (12) and (13), are not related to one 
another by a straightforward transformation. On the other hand, 
the relationship between (10) and (11) is transparent. 
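
As a minimal sketch of equations (6) and (7) and of the rescaling experiment, the Python fragment below computes L and b before and after the first variable is multiplied by ten. The 15 x 5 data are read from the coefficients of D1 to D15 in the five constraints of the linear-program listing, on the assumption that these coefficients are the columns of X, as equation (9) implies. Because the listing rounds them to four decimals, the printed values need not agree with (10) and (11) in every digit. The helper names `solve` and `modified_orthogonal` are illustrative and are not taken from the original.

```python
# Columns of X, ASSUMED to be the coefficients of D1..D15 in the five
# constraints of the linear-program listing (rounded to four decimals).
X_COLS = [
    [1.2489, .2365, -.3627, 1.5916, .8176, -2.3717, -.1758, -.2694,
     .2092, -.0537, -1.3818, -.1127, -.634, .6595, .5984],
    [-1.2233, .5172, -.15, 1.8516, .9119, .1574, .4104, -2.1325,
     1.1412, 1.8174, -2.0502, -1.2723, -1.0411, .1888, .8735],
    [1.1348, .1794, -.284, 1.3597, .7665, -2.1992, -.2686, -.5668,
     .5606, .2053, -1.5132, -.0885, -.608, .8501, .4719],
    [-1.2265, .3618, -.1923, 1.7249, .5742, .0626, .1748, -1.7472,
     .9642, 2.0861, -2.5885, -.9694, -.515, .5806, .7098],
    [.6205, .4656, -.3981, 2.1422, 1.017, -2.0325, -.0274, -1.0196,
     .5874, .6575, -1.9695, -.593, -.9685, .642, .8765],
]


def solve(a, rhs):
    """Solve a z = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, k):
            f = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * k
    for r in reversed(range(k)):  # back substitution
        s = sum(m[r][c] * z[c] for c in range(r + 1, k))
        z[r] = (m[r][k] - s) / m[r][r]
    return z


def modified_orthogonal(cols, e):
    """Equations (6) and (7): L = 1 / e'(X'X)^{-1}e, b = L (X'X)^{-1} e."""
    k = len(cols)
    xtx = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
           for i in range(k)]
    z = solve(xtx, e)  # z = (X'X)^{-1} e, without forming the inverse
    L = 1.0 / sum(ei * zi for ei, zi in zip(e, z))
    return L, [L * zi for zi in z]


# Original units: e is a vector of ones, normalization (4).
L, b = modified_orthogonal(X_COLS, [1.0] * 5)

# First variable multiplied by c = 10, with normalization (8).
c = 10.0
scaled = [[c * x for x in X_COLS[0]]] + X_COLS[1:]
L_c, b_c = modified_orthogonal(scaled, [c, 1.0, 1.0, 1.0, 1.0])

print("L   =", round(L, 4), " b   =", [round(v, 4) for v in b])
print("L_c =", round(L_c, 4), " b_c =", [round(v, 4) for v in b_c])
```

Solving the linear system (X'X)z = e avoids forming the inverse explicitly; the row sums of (X'X)^{-1} mentioned in the text are exactly the elements of z when e is a vector of ones.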

The linear program for the robust orthogonal regression is 
shown below. An intercept (B0) has been included. The regression 
coefficients are not very different from (10) , nor do there 
appear to be exceptionally large residuals in the column labeled 
REDUCED COST. It is therefore unlikely that the X matrix is 
contaminated by stray observations. 

In conclusion, our modified orthogonal regression allows a 
researcher to explore a symmetric, equivariant and robust linear 
relationship among a set of variables. 

Linear program for robust orthogonal regression 

Maximize L subject to: 

1.2489*D1+.2365*D2-.3627*D3+1.5916*D4+.8176*D5-2.3717*D6-.1758*D7- 
.2694*D8+.2092*D9-.0537*D10-1.3818*D11-.1127*D12-.634*D13+.6595*D14+ 
.5984*D15+L=0 

-1.2233*D1+.5172*D2-.15*D3+1.8516*D4+.9119*D5+.1574*D6+.4104*D7- 
2.1325*D8+1.1412*D9+1.8174*D10-2.0502*D11-1.2723*D12-1.0411*D13+ 
.1888*D14+.8735*D15+L=0 

1.1348*D1+.1794*D2-.284*D3+1.3597*D4+.7665*D5-2.1992*D6-.2686*D7- 
.5668*D8+.5606*D9+.2053*D10-1.5132*D11-.0885*D12-.608*D13+.8501*D14+ 
.4719*D15+L=0 

-1.2265*D1+.3618*D2-.1923*D3+1.7249*D4+.5742*D5+.0626*D6+.1748*D7- 
1.7472*D8+.9642*D9+2.0861*D10-2.5885*D11-.9694*D12-.515*D13+.5806*D14+ 
.7098*D15+L=0 

.6205*D1+.4656*D2-.3981*D3+2.1422*D4+1.017*D5-2.0325*D6-.0274*D7- 
1.0196*D8+.5874*D9+.6575*D10-1.9695*D11-.593*D12-.9685*D13+.642*D14+ 
.8765*D15+L=0 

(D1+...+D15)=0 

1>=D1>=-1 
1>=D2>=-1 
1>=D3>=-1 
1>=D4>=-1 
1>=D5>=-1 
1>=D6>=-1 
1>=D7>=-1 
1>=D8>=-1 
1>=D9>=-1 
1>=D10>=-1 
1>=D11>=-1 
1>=D12>=-1 
1>=D13>=-1 
1>=D14>=-1 
1>=D15>=-1 



VARIABLE    VALUE          REDUCED COST 
D1          -1.0000000     -.002865 
D2           1.0000000      .201251 
D3          -1.0000000     -.043831 
D4           -.44177346     .000000 
D5          -1.0000000     -.191481 
D6           -.69180276     .000000 
D7          -1.0000000     -.142338 
D8           1.0000000      .040054 
D9            .69571016     .000000 
D10          1.0000000      .030376 
D11          1.0000000      .183603 
D12           .66671341     .000000 
D13         -1.0000000     -.068614 
D14          -.22884735     .000000 
D15          1.0000000      .002487 
L             .90689890     .000000 

B1           3.6788988 
B2           1.3488624 
B3           -.41810227 
B4            .12282872 
B5          -3.7324877 
B0           -.00052750 